Automated Construction of Database Interfaces: Integrating Statistical and Relational Learning for Semantic Parsing

Authors

  • Lappoon R. Tang
  • Raymond J. Mooney
Abstract

The development of natural language interfaces (NLI's) for databases has been a challenging problem in natural language processing (NLP) since the 1970's. The need for NLI's has become more pronounced due to the widespread access to complex databases now available through the Internet. A challenging problem for empirical NLP is the automated acquisition of NLI's from training examples. We present a method for integrating statistical and relational learning techniques for this task which exploits the strengths of both approaches. Experimental results from three different domains suggest that such an approach is more robust than a previous purely logic-based approach.

1 Introduction

We use the term semantic parsing to refer to the process of mapping a natural language sentence to a structured meaning representation. One interesting application of semantic parsing is building natural language interfaces for online databases. The need for such applications is growing because, when information is delivered through the Internet, most users do not know the underlying database access language. An example of such an interface that we have developed is shown in Figure 1.

Traditional (rationalist) approaches to constructing database interfaces require an expert to hand-craft an appropriate semantic parser (Woods, 1970; Hendrix et al., 1978). However, such hand-crafted parsers are time-consuming to develop and suffer from problems with robustness and incompleteness even for domain-specific applications. Nevertheless, very little research in empirical NLP has explored the task of automatically acquiring such interfaces from annotated training examples. The only exceptions of which we are aware are a statistical approach to mapping airline-information queries into SQL presented in (Miller et al., 1996), a probabilistic decision-tree method for the same task described in (Kuhn and De Mori, 1995), and an approach using relational learning (a.k.a. inductive logic programming, ILP) to learn a logic-based semantic parser described in (Zelle and Mooney, 1996).

The existing empirical systems for this task employ either a purely logical or a purely statistical approach. The former uses a deterministic parser, which can suffer from some of the same robustness problems as rationalist methods. The latter constructs a probabilistic grammar, which requires supplying a syntactic parse tree as well as a semantic representation for each training sentence, and requires hand-crafting a small set of contextual features on which to condition the parameters of the model. Combining relational and statistical approaches can overcome the need to supply parse-trees and hand-crafted features while retaining the robustness of statistical parsing. The current work is based on the CHILL logic-based parser-acquisition framework (Zelle and Mooney, 1996), retaining access to the complete parse state for making decisions, but building a probabilistic relational model that allows for statistical parsing.

2 Overview of the Approach

This section reviews our overall approach, using as a sample application an interface developed for a U.S. geography database (Geoquery) (Zelle and Mooney, 1996), which is available on the Web (see http://www.cs.utexas.edu/users/ml/geo.html).

2.1 Semantic Representation

First-order logic is used as a semantic representation language. CHILL has also been applied to a restaurant database in which the logical form resembles SQL, and is translated

Similar resources

Automated Construction of Database Interfaces: Integrating Statistical and Relational Learning for Semantic Parsing

The development of natural language interfaces (NLI's) for databases has been a challenging problem in natural language processing (NLP) since the 1970's. The need for NLI's has become more pronounced due to the widespread access to complex databases now available through the Internet. A challenging problem for empirical NLP is the automated acquisition of NLI's from training examples. We prese...

Integrating Statistical and Relational Learning for Semantic Parsing: Applications to Learning Natural Language Interfaces for Databases

The development of natural language interfaces (NLIs) for databases has been an interesting problem in natural language processing since the 70's. The need for NLIs has become more pronounced given the widespread access to complex databases now available through the Internet. However, such systems are difficult to build and must be tailored to each application. A current research topic involves u...

Learning for Semantic Parsing

Semantic parsing is the task of mapping a natural language sentence into a complete, formal meaning representation. Over the past decade, we have developed a number of machine learning methods for inducing semantic parsers by training on a corpus of sentences paired with their meaning representations in a specified formal language. We have demonstrated these methods on the automated constructio...

Automatic Semantic Role Labeling in Persian Sentences Using Dependency Trees

Automatically identifying words with semantic roles (such as Agent, Patient, Source, etc.) in sentences and attaching the correct semantic roles to them may improve many natural language processing tasks, including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...

Simplifying Syntactic and Semantic Parsing of NL Based Queries in Advanced Application Domains

The paper aims at presenting a natural (sub)language based querying approach (MDDQL) for SQL (relational, object-relational) databases, which relies on an ontology driven, interactive query construction mechanism. This guides the user to the construction of queries that are semantically compliant with the application domain semantics. To this extent, syntactic and semantic parsing of a query is...

Publication date: 2000